Efficient estimation of graphlet frequency distributions in protein-protein interaction networks
نویسندگان
چکیده
MOTIVATION Algorithmic and modeling advances in the area of protein-protein interaction (PPI) network analysis could contribute to the understanding of biological processes. Local structure of networks can be measured by the frequency distribution of graphlets, small connected non-isomorphic induced subgraphs. This measure of local structure has been used to show that high-confidence PPI networks have local structure of geometric random graphs. Finding graphlets exhaustively in a large network is computationally intensive. More complete PPI networks, as well as PPI networks of higher organisms, will thus require efficient heuristic approaches. RESULTS We propose two efficient and scalable heuristics for finding graphlets in high-confidence PPI networks. We show that both PPI and their model geometric random networks, have defined boundaries that are sparser than the 'inner parts' of the networks. In addition, these networks exhibit 'uniformity' of local structure inside the networks. Our first heuristic exploits these two structural properties of PPI and geometric random networks to find good estimates of graphlet frequency distributions in these networks up to 690 times faster than the exhaustive searches. Our second heuristic is a variant of a more standard sampling technique and it produces accurate approximate results up to 377 times faster than the exhaustive searches. We indicate how the combination of these approaches may result in an even better heuristic. AVAILABILITY Supplementary information is available at http://www.cs.toronto.edu/~natasha/BIOINF-2005-0946/Supplementary.pdf. Software implementing the algorithms is available at http://www.cs.toronto.edu/~natasha/BIOINF-2005-0946/estimate_grap-hlets.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
منابع مشابه
Construction and Analysis of Tissue-Specific Protein-Protein Interaction Networks in Humans
We have studied the changes in protein-protein interaction network of 38 different tissues of the human body. 123 gene expression samples from these tissues were used to construct human protein-protein interaction network. This network is then pruned using the gene expression samples of each tissue to construct different protein-protein interaction networks corresponding to different studied ti...
متن کاملComparison of Hubs in Effective Normal and Tumor Protein Interaction Networks
ABSTRACTIntroduction: Cancer is caused by genetic abnormalities, such as mutation of ontogenesis or tumor suppressor genes which alter downstream signaling pathways and protein-protein interactions. Comparison of protein interactions in cancerous and normal cells can be of help in mechanisms of disease diagnoses and treatments. Methods: We constructed protein interaction networks of cancerous a...
متن کاملPrediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Prediction of the protein localization is among the most important issues in the bioinformatics that is used for the prediction of the proteins in the cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of the intracellular protein locations. These algorithms use the features extracted from pro...
متن کاملStudy of PKA binding sites in cAMP-signaling pathway using structural protein-protein interaction networks
Backgroud: Protein-protein interaction, plays a key role in signal transduction in signaling pathways. Different approaches are used for prediction of these interactions including experimental and computational approaches. In conventional node-edge protein-protein interaction networks, we can only see which proteins interact but ‘structural networks’ show us how these proteins inter...
متن کاملGraphlet-based measures are suitable for biological network comparison
MOTIVATION Large amounts of biological network data exist for many species. Analogous to sequence comparison, network comparison aims to provide biological insight. Graphlet-based methods are proving to be useful in this respect. Recently some doubt has arisen concerning the applicability of graphlet-based measures to low edge density networks-in particular that the methods are 'unstable'-and f...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Bioinformatics
دوره 22 8 شماره
صفحات -
تاریخ انتشار 2006